Document Layout Analysis


Document layout analysis (DLA) is the process of analyzing the spatial arrangement of a document's content to recover its structure. This includes locating text, tables, images, and other elements, and identifying the logical hierarchy, such as headings and subheadings. DLA supports information extraction and categorization, and the automation of document processing workflows.
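As a rough illustration of the kind of output DLA produces, the sketch below represents detected elements as labeled bounding boxes and groups them under headings. It is not taken from any of the papers listed here. The `Region` type, the label names, and the functions `reading_order` and `outline` are all hypothetical, and the top-to-bottom ordering assumes a simple single-column page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One detected layout element (hypothetical schema, for illustration)."""
    label: str          # assumed label set: "heading", "text", "table", "image"
    x0: float           # left edge
    y0: float           # top edge (y grows downward)
    x1: float           # right edge
    y1: float           # bottom edge
    content: str = ""   # recognized text, if any


def reading_order(regions):
    """Sort top-to-bottom, then left-to-right (single-column assumption)."""
    return sorted(regions, key=lambda r: (r.y0, r.x0))


def outline(regions):
    """Group non-heading regions under the nearest preceding heading."""
    sections = []
    current = {"heading": None, "items": []}
    for region in reading_order(regions):
        if region.label == "heading":
            # Close the previous section only if it holds anything.
            if current["heading"] is not None or current["items"]:
                sections.append(current)
            current = {"heading": region.content, "items": []}
        else:
            current["items"].append(region.label)
    if current["heading"] is not None or current["items"]:
        sections.append(current)
    return sections


# Made-up detections for one page, deliberately out of order.
page = [
    Region("text", 50, 120, 550, 300, "Body paragraph"),
    Region("heading", 50, 40, 550, 80, "1 Introduction"),
    Region("table", 50, 320, 550, 500),
    Region("heading", 50, 520, 550, 560, "2 Method"),
    Region("image", 50, 580, 550, 760),
]

for section in outline(page):
    print(section["heading"], section["items"])
```

Real pages with multiple columns, floats, or rotated text need a more careful ordering step than the plain coordinate sort used here.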

Dolphin-v2: Universal Document Parsing via Scalable Anchor Prompting

Feb 05, 2026, via arXiv

Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding

Jan 28, 2026, via arXiv

Prompt Driven Development with Claude Code: Building a Complete TUI Framework for the Ring Programming Language

Jan 24, 2026, via arXiv

FP-THD: Full page transcription of historical documents

Jan 20, 2026, via arXiv

PARL: Position-Aware Relation Learning Network for Document Layout Analysis

Jan 12, 2026, via arXiv

MathDoc: Benchmarking Structured Extraction and Active Refusal on Noisy Mathematics Exam Papers

Jan 15, 2026, via arXiv

IndicDLP: A Foundational Dataset for Multi-Lingual and Multi-Domain Document Layout Parsing

Dec 23, 2025, via arXiv

Layout-Aware Text Editing for Efficient Transformation of Academic PDFs to Markdown

Dec 19, 2025, via arXiv

Post-Processing Mask-Based Table Segmentation for Structural Coordinate Extraction

Dec 24, 2025, via arXiv

LLM-Guided Probabilistic Fusion for Label-Efficient Document Layout Analysis

Nov 13, 2025, via arXiv